A discriminative decision tree learning approach to acoustic modeling
نویسندگان
چکیده
The decision tree is a popular method to accomplish tying of the states of a set context dependent phone HMMs for efficient and effective training of the large acoustic models. A likelihood-based impurity function is commonly adopted. It is well known that maximizing likelihood does not result in the maximal separation between the distributions in the leaves of the tree. To improve robustness, a discriminative decision tree learning approach is proposed. It embeds the MCE-GPD formulation in defining the impurity function so that the discriminative information could be taken into account while optimizing the tree. We compare the proposed approach with the conventional tree building using a Mandarin syllable recognition task. Our preliminary results show that the separation between the divided subspaces in the tree nodes is clearly enhanced although there is a slight performance reduction.
منابع مشابه
A discriminative splitting criterion for phonetic decision trees
Phonetic decision trees are a key concept in acoustic modeling for large vocabulary continuous speech recognition. Although discriminative training has become a major line of research in speech recognition and all state-of-the-art acoustic models are trained discriminatively, the conventional phonetic decision tree approach still relies on the maximum likelihood principle. In this paper we deve...
متن کاملMMDT: Multi-Objective Memetic Rule Learning from Decision Tree
In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretation are two measures that conflict with each other. In this approach, we consider accuracy and interpretation of rules sets. Additionally, individual classifiers face other problems such as huge sizes, high dimensionality and imbalance classes’ distribution data sets. This...
متن کاملACID/HNN: clustering hierarchies of neural networks for context-dependent connectionist acoustic modeling
We present the ACID/HNN framework, a principled approach to hierarchical connectionist acoustic modeling in large vocabulary conversational speech recognition (LVCSR). Our approach consists of an Agglomerative Clustering algorithm based on Information Divergence (ACID) to automatically design and robustly estimate Hierarchies of Neural Networks (HNN) for arbitrarily large sets of context-depend...
متن کاملAcoustic Modeling in Statistical Parametric Speech Synthesis – from Hmm to Lstm-rnn
Statistical parametric speech synthesis (SPSS) combines an acoustic model and a vocoder to render speech given a text. Typically decision tree-clustered context-dependent hidden Markov models (HMMs) are employed as the acoustic model, which represent a relationship between linguistic and acoustic features. Recently, artificial neural network-based acoustic models, such as deep neural networks, ...
متن کاملAutomatic pronunciation error detection: an acoustic-phonetic approach
In this paper, we present an acoustic-phonetic approach to automatic pronunciation error detection. Classifiers using techniques such as Linear Discriminant Analysis or a decision tree were developed for three sounds that are frequently pronounced incorrectly by L2-learners of Dutch: /A/, /Y/ and /x/. The acoustic properties of these pronunciation errors were examined so as to define a number o...
متن کامل